I.  INTRODUCTION


A.)  Background

The  Cooper-Harper  pilot  rating  scale  for  the  evaluation  of  aircraft 
has  found  wide  acceptance  in  the  field  of  handling  qualities  research. 

The  scale,  shown  in  Figure  1,  is  a  means  of  quantifying  a  pilot’s 
impressions  of  the  handling  qualities  of  an  aircraft  which  is  involved 
in  a  specific  mission  element  or  task.  The  scale  is  adjectival,  ordinal 
and nonlinear in nature. It is adjectival in that descriptors such as
"controllable", "adequate" and "satisfactory" appear in the flow diagram
used by the pilot. It is ordinal in that handling qualities are ranked
in order of decreasing acceptability. It is nonlinear in that a rating
of, say, 8 does not necessarily indicate handling qualities which are
twice as unacceptable as those receiving a rating of 4. The utility of
the Cooper-Harper scale has been recently enhanced by a method for
predicting ratings.

As successful and useful as this rating scale has been, it is not
without its weaknesses. Chief among these are its qualitative character
and its ordinal nature. In an attempt to alleviate some of these
difficulties, J. D. McDonnell proposed a "global" rating scale for
handling  qualities  investigations.  This  scale,  shown  in  Figure  2,  is  an 
adjectival,  nonordinal,  linear  scale  developed  through  the  methods  of 
psychometrics.  While  not  receiving  the  wide  acceptance  of  the  Cooper- 
Harper  scale,  the  Global  scale  has  been  utilized  in  handling  qualities 
investigations.

McDonnell's work centered about finding the coordinates of certain
adjectival  phrases  on  a  psychological  continuum  which  he  called  the  t 
scale.  The  adjectival  phrases  were  those  most  commonly  encountered  in 
handling  qualities  research. 

The psychological continuum can be interpreted in the following
manner.  If  a  measurement  is  made  on  a  physical  object  with  a  nonhuman 
instrument  of  some  sort,  the  measure  is  an  objective  one  and  the  resulting 
data  lie  along  a  physical  continuum.  When  a  human  observer  estimates  a 
measure,  it  is  a  subjective  judgment  and  the  estimates  lie  along  a 
psychological  continuum. 

C. V. Schufeldt advanced yet another rating scale. His scale,
shown in one of its forms in Figure 3, is nonadjectival, nonordinal,
and linear in nature. The impetus behind Schufeldt's research was the
idea of developing a scale which would reflect relatively minor differences
in system characteristics. To accomplish this, the scale would have to
exhibit a good deal of sensitivity without overtaxing the resolution
capability of the operator. Schufeldt's hypothesis was that a linear
rating scale coincident with the psychological continuum begets such
sensitivity. While the Global scale of McDonnell is conceptually close
to this realization, Schufeldt felt that in certain applications, the
adjectives were a hindrance. He wanted to know if removing the adjectives
would allow the rater to transpose his impressions of a system directly
to a linear, numerical index. In addition, he wondered if allowing the
subject to fractionize his rating would increase scale sensitivity.

Schufeldt investigated his hypothesis by submitting a child's
puzzle ("EVEN-STEVEN" by Kohner) to some thirty students in the Department
of Aeronautics. Upon successful solution of the puzzle, or at the
expiration of an allotted time, whichever occurred first, the subject was
asked to rate his impression of the difficulty he encountered in working
the puzzle. The subjects indicated their ratings on three different
scales, one of which is shown in Figure 3. Schufeldt found a high
correlation coefficient (e.g. 0.928 for the scale of Figure 3) between
ratings and performance.

B.)  Critical-Subcritical  Tasks 

Encouraged by Schufeldt's results, this author was eager to use
the  scale  in  an  environment  more  closely  related  to  handling  qualities 
investigations,  i.e.  fixed  base  tracking  tasks. 

If Schufeldt's scale does indeed possess a sensitivity superior to
previous scales, it should yield better results in areas where these
scales were overly sensitive, i.e. the high end (8-10) of the Cooper-
Harper scale. If the experiment is to be tractable, the task difficulty
should be controlled by as few parameters as possible. Finally, since
it was desired to keep the duration of the entire experimental program
short, a task which tended to minimize training times should be selected.
These criteria pointed toward the selection of the "critical-subcritical"
tracking tasks as pioneered by Jex, McDonnell, and Phatak.

Critical task (first-order) refers to a special compensatory tracking
task in which the real pole, $\lambda$, of a first order controlled element
is moved slowly into the right half of the s plane until the subject or
operator can no longer maintain control. The value of $\lambda$ at the onset of
instability is called the critical instability score, $\lambda_c$. No input is
required since operator remnant serves to excite the system.

Subcritical task (first-order) refers to a similar tracking
situation in which the value of the unstable pole, $\lambda$, is kept at a
constant and controllable value, $\lambda_s$, throughout the run. In subcritical
tracking, a random appearing input is usually applied. Figure 4 is a
block diagram representing the critical and subcritical systems.
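
The role of the pole location can be illustrated with a short worked form. This is a sketch only: the transfer function below is the form commonly quoted for a first-order unstable controlled element, and the gain $K$, the operator output $c(t)$ and the system output $m(t)$ are symbols assumed here for illustration rather than taken from this report. The pole is written $\lambda$.

```latex
% Assumed first-order controlled element with a real pole at s = \lambda
Y_c(s) = \frac{K\lambda}{s - \lambda}
\quad\Longleftrightarrow\quad
\dot{m}(t) = \lambda\, m(t) + K\lambda\, c(t)

% With no control action (c = 0) the output diverges exponentially
m(t) = m(0)\, e^{\lambda t}, \qquad \lambda > 0

% Time for an uncorrected error to double
t_{2} = \frac{\ln 2}{\lambda}
```

Under this assumed form, moving the pole further into the right half plane shortens the doubling time $t_2$, which is why a larger $\lambda$ corresponds to a more difficult tracking task.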

II.  EXPERIMENT 

A. )  Procedure 

Fourteen  subjects  were  chosen  for  the  experiment.  Of  these  fourteen, 
six  were  military  pilots,  two  were  civilian  pilots  and  six  were  nonpilots. 

The basic experimental procedure went as follows. A subject performed
the critical task experiment twenty times in succession. An average
critical instability score, $\lambda_c$, was obtained as the mean of his five
highest scores. Five subcritical systems were then chosen with pole
locations given by:

$\lambda_{s_i} = \frac{i}{6}\,\lambda_c, \qquad i = 1, 2, 3, 4, 5$

The subject made ten runs of fixed duration, in succession, for each of
these systems. After each set of ten runs, the subject was asked to
rate the system as per the instructions of Figure 5. The five subcritical
systems were ordered randomly and this random order, once selected, was
reversed for every operator. This means operator 1 tracked the subcritical
systems in the order: $\lambda_{s_3}, \lambda_{s_1}, \lambda_{s_4}, \lambda_{s_5}, \lambda_{s_2}$, while for operator 2 the
order was: $\lambda_{s_2}, \lambda_{s_5}, \lambda_{s_4}, \lambda_{s_1}, \lambda_{s_3}$, etc.
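
The setup steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure as actually mechanized: the function names are hypothetical, the pole spacing is taken as i/6 of the average critical score (consistent with the 1/6 to 5/6 range reported in the results), and the alternation rule assumes the order is simply flipped for even-numbered operators.

```python
def average_critical_score(scores, n_best=5):
    """Mean of the n_best highest critical instability scores (rad/sec)."""
    best = sorted(scores, reverse=True)[:n_best]
    return sum(best) / len(best)


def subcritical_poles(lambda_c):
    """Five subcritical pole locations, assumed to be (i/6) * lambda_c, i = 1..5."""
    return [i * lambda_c / 6 for i in range(1, 6)]


def presentation_order(base_order, operator_number):
    """Base order for odd-numbered operators, reversed for even-numbered ones.

    The odd/even rule is an assumption drawn from the two example orders.
    """
    if operator_number % 2 == 1:
        return list(base_order)
    return list(reversed(base_order))


if __name__ == "__main__":
    # Hypothetical critical task scores for twenty runs (not data from the study).
    scores = [3.1, 3.4, 3.9, 4.2, 4.0, 3.8, 4.4, 4.1, 3.7, 4.3,
              3.6, 4.5, 4.0, 3.9, 4.2, 4.6, 3.5, 4.1, 4.4, 4.3]
    lam_c = average_critical_score(scores)
    print("average critical score:", round(lam_c, 3))
    print("subcritical poles:", [round(p, 3) for p in subcritical_poles(lam_c)])
    print("operator 2 order:", presentation_order([3, 1, 4, 5, 2], 2))
```

Averaging only the five highest of the twenty scores makes the reference value reflect the operator's best demonstrated capability rather than early learning runs.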

The Measurement Systems Inc. isometric, finger grip manipulator
was utilized for the study. The system error was displayed to the
operator as the displacement of a horizontal line on an oscilloscope
screen. The system dynamics, input and mean square error circuits were
mechanized on a small analog computer. Table I summarizes the experimental
setup. Figure 6 shows the layout.

B.)  Discussion 

The parameters of Table I were selected to coincide as nearly as
possible with those of similar experiments conducted by Systems Technology
Inc. (STI). Due to equipment limitations, the sum of only two sinusoids
was used as an input for the subcritical task. Their magnitudes and
frequencies were chosen to coincide with those of the two lowest
frequency sinusoids used by STI. Were the controlled element, $Y_c(s)$,
stable,  the  sum  of  just  two  sinusoids  would  probably  not  appear  random 
enough  to  ensure  compensatory  behavior.  However,  the  open  loop  instability 
made  it  very  difficult  for  the  operator  to  utilize  anything  but  error 
information  in  tracking. 

In  view  of  the  large  number  of  runs  in  a  single  experiment  (20  critical 
+  50  subcritical  =  70  runs)  it  was  decided  to  reduce  the  subcritical  run 
lengths  from  an  original  100  seconds  to  50  seconds.  Early  experiments 
with  the  100  second  lengths  resulted  in  considerable  operator  fatigue  and 
poor performance. The shorter run lengths, however, probably decreased
the  accuracy  of  mean  square  error  scores. 

A brief comment on the rating instructions of Figure 5 is in order.
At no time was the subject explicitly instructed to associate a particular
scale value with a particular system. In addition, each time the subject
was asked to evaluate a system, he was given a clean rating sheet.

III.  RESULTS


Figure 7 summarizes the experimental results. A set of typical
time histories is shown in Figure 8. Table II gives the performance
and  ratings  of  the  fourteen  test  subjects.  The  error  scores  for  the 
first  four  subjects  were  deleted  since  poor  analog  scaling  caused  these 
values  to  be  inaccurate. 

The correlation coefficient for the rating vs. $\lambda_s/\lambda_c$ data is 0.73
as shown in Figure 7. The mean ratings are seen to fall quite close to
the regression line. Regression analysis of ratings vs. performance was
hampered because of the fact that in five of the subcritical configurations
the operators lost control in at least eight of the ten runs. It was
difficult  to  quantify  this  performance  and  relate  it  to  that  obtained 
when  control  was  maintained  for  the  full  50  seconds.  Hence  no  further 
analysis  of  the  error  scores  beyond  that  shown  in  Table  II  has  been 
presented. 
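
The rating statistics quoted above can be recomputed from the ratings listed in Table II. The following is a minimal sketch: the ratings are transcribed from that table (rows are Subjects 1 to 14, columns are the five subcritical configurations), and treating every individual rating as one data point in the correlation is an assumption about how the coefficient was formed.

```python
import math

# Ratings from Table II; columns correspond to pole ratios 1/6, 2/6, 3/6, 4/6, 5/6.
RATINGS = [
    [3.5, 5.5, 5.0, 7.5, 9.0],
    [2.0, 1.0, 5.0, 6.5, 9.0],
    [0.5, 1.0, 1.5, 3.0, 3.5],
    [1.5, 2.5, 2.1, 3.7, 6.5],
    [3.0, 4.0, 7.0, 5.5, 10.0],
    [2.5, 4.0, 5.0, 7.0, 7.5],
    [2.5, 3.0, 4.5, 8.0, 10.0],
    [2.8, 6.5, 4.0, 7.0, 9.5],
    [3.3, 1.8, 6.5, 7.0, 8.8],
    [1.5, 4.0, 5.75, 4.5, 7.0],
    [4.6, 6.1, 5.5, 7.3, 9.2],
    [3.0, 5.0, 4.0, 4.5, 8.0],
    [6.0, 7.0, 7.5, 9.9, 10.0],
    [4.0, 7.0, 6.0, 8.0, 9.0],
]
RATIOS = [i / 6 for i in range(1, 6)]


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Mean rating for each configuration, averaged over the fourteen subjects.
mean_ratings = [sum(row[j] for row in RATINGS) / len(RATINGS) for j in range(5)]

# Pool all 70 (ratio, rating) pairs and correlate them.
xs = [RATIOS[j] for row in RATINGS for j in range(5)]
ys = [row[j] for row in RATINGS for j in range(5)]
r = pearson_r(xs, ys)

if __name__ == "__main__":
    print("mean ratings:", [round(m, 2) for m in mean_ratings])
    print("correlation coefficient:", round(r, 2))
```

Pooling individual ratings rather than correlating the five mean ratings keeps the between-subject scatter in the coefficient, which is why it sits well below unity even though the means lie close to a straight line.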


IV.  CONCLUSIONS 


a.)  It does appear that the human operator can transpose his
impressions of a system directly to a linear numerical index. The lack
of adjectives does not appear to detract from the operator's ability to
generate subjective opinion.

b.)  The ability of the subject to utilize the linear, nonadjectival
scale does not appear to depend upon previous experience with rating
scales in general. The test subjects ranged from the decidedly
nontechnical (the author's wife) to Navy carrier pilots in the Department
of Aeronautics.

c.)  The scale appears reasonably sensitive, i.e. the mean ratings
are seen to range from 2.9 to 8.4 (55% of the rating scale) as $\lambda_s/\lambda_c$
ranges from 1/6 to 5/6 (66.7% of the $\lambda_s/\lambda_c$ scale). The standard deviations
of the ratings are fairly uniform across the $\lambda_s/\lambda_c$ scale. This indicates
constant sensitivity along the rating scale which is a characteristic of
the psychological continuum^3.

It  must  be  emphasized  that  the  rating  scale  investigated  here  is 
not  offered  as  a  replacement  for  the  highly  successful  Cooper-Harper 
scale.  This  should  be  obvious.  However,  there  may  arise  instances 
when  one  desires  to  detect,  in  a  relative  sense,  minor  changes  in  system 
acceptability.  In  such  instances,  adjectival  scales  are  simply  not 
appropriate since they lack the necessary sensitivity or overtax the
operator’s  resolution  capability.  In  these  cases,  a  scale  such  as  the 
one  investigated  here  may  prove  useful. 

Figure 2  A Global Rating Scale for Handling Qualities Evaluation
(Axis: Favorability of Handling Qualities. Scale labels: Good, Fair, Poor,
Bad, Nearly Uncontrollable, Uncontrollable.)

Figure 3  Schufeldt's Nonadjectival, Nonordinal, Linear Rating Scale
(Axis: Increasing Difficulty.)

Figure 4  Critical and Subcritical Tracking Tasks


SUBJECT

DATE


The critical task provided information regarding the limits of
your ability to control an unstable system. Using the scale below,
indicate the degree of difficulty you encountered in controlling the
subcritical system checked. All the systems you will be asked to rate
in this manner will be unstable.

Increasing Difficulty

0   1   2   3   4   5   6   7   8   9   10


System 1
System 2
System 3
System 4
System 5


Figure 5  Rating Sheet for Subcritical Task

Figure 6  Tracking Task Equipment Layout

Figure 7  Ratings vs. $\lambda_s/\lambda_c$

Figure 8  Critical Task Stick Output and Error Signals; Subject 14
(Critical Instability Score = 3.62)

Figure 8 cont'd.  Subcritical Task Input, Stick Output and Error Signals;
Subject 14 (time scale mark: 3 secs; $\lambda_s/\lambda_c$ = 5/6)

TABLE I

Critical and Subcritical Task Parameters

$\lambda = \lambda_0 + \dot{\lambda}\, t$  (Critical Task)

$\lambda_0$ = 1.0 rad/sec

$\dot{\lambda}$ = 0.1 rad/sec$^2$

Control/display sensitivity
     = 0.9 cm scope deflection/newton stick force

$K_D$ = display viewing gain for 50 cm nominal viewing distance
     = 1.0 degree visual angle/cm display deflection

$i(t)$ = input (Subcritical Task)
     = 0.494 sin 0.502 t + 0.460 sin 1.256 t  cm

$\overline{i^2(t)}$ = mean square input
     = 0.23 cm$^2$
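
As a check on the input parameters, the mean square value of a sum of two sinusoids at different frequencies is half the sum of the squared amplitudes. The sketch below evaluates that expression and compares it with a direct time average over a 50 second run; the function names, the step size and the averaging method are choices made here for illustration only.

```python
import math

AMPLITUDES_CM = (0.494, 0.460)     # sinusoid amplitudes, cm
FREQUENCIES_RAD_S = (0.502, 1.256)  # sinusoid frequencies, rad/sec


def input_signal(t):
    """Subcritical task input i(t) in cm at time t in seconds."""
    return sum(a * math.sin(w * t) for a, w in zip(AMPLITUDES_CM, FREQUENCIES_RAD_S))


def mean_square_analytic():
    """Long-run mean square of the input: sum of (amplitude squared) / 2."""
    return sum(a * a / 2.0 for a in AMPLITUDES_CM)


def mean_square_numeric(duration_s=50.0, dt=0.01):
    """Time average of i(t)^2 over one run, by simple rectangular summation."""
    n_steps = int(round(duration_s / dt))
    total = sum(input_signal(k * dt) ** 2 for k in range(n_steps))
    return total / n_steps


if __name__ == "__main__":
    print("analytic mean square (cm^2):", round(mean_square_analytic(), 4))
    print("50 s time average (cm^2):  ", round(mean_square_numeric(), 4))
```

The two values agree closely because each sinusoid completes very nearly a whole number of cycles in 50 seconds, so the finite-run average is close to the long-run value.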

TABLE II

Experimental Results - Ratings and Performance

Entries are rating / error score for each value of $\lambda_s/\lambda_c$.

Subj.  1/6            2/6            3/6            4/6            5/6            $\lambda_c$

1      3.5            5.5            5.0            7.5            9.0            4.77
2      2.0            1.0            5.0            6.5            9.0            4.06
3      0.5            1.0            1.5            3.0            3.5            3.96
4      1.5            2.5            2.1            3.7            6.5            4.18
5      3.0 / .096     4.0 / .167     7.0 / .488     5.5 / .453     10.0 / 1.689   4.47
6      2.5 / .331     4.0 / .270     5.0 / .810     7.0 / 1.215    7.5 / 1.662    3.04
7      2.5 / .410     3.0 / .611     4.5 / .871     8.0 / -        10.0 / -       4.58
8      2.8 / .472     6.5 / .500     4.0 / .993     7.0 / 1.660    9.5 / -        4.28
9      3.3 / .120     1.8 / .182     6.5 / .521     7.0 / .484     8.8 / 1.017    4.23
10     1.5 / .031     4.0 / .117     5.75 / .337    4.5 / .202     7.0 / .821     4.23
11     4.6 / .041     6.1 / .187     5.5 / .406     7.3 / .352     9.2 / 1.802    4.75
12     3.0 / .080     5.0 / .433     4.0 / .193     4.5 / .401     8.0 / -        4.25
13     6.0 / .060     7.0 / .094     7.5 / .392     9.9 / .286     10.0 / .811    4.26
14     4.0 / .286     7.0 / .337     6.0 / 1.050    8.0 / 1.531    9.0 / -        3.81

-  Indicates Subject Lost Control in at Least Eight of Ten Runs